Multi armed bandit problem: some insights
Authors: not recorded
Abstract
Multi-Armed Bandit problems have been widely studied in the context of sequential analysis. Application areas include clinical trials, adaptive filtering, and online advertising. The problem is often framed as selecting a policy that maximizes a gambler's reward when playing multiple slot machines, each generating rewards from an unknown distribution. It is under this framework that we describe the model and develop the subsequent results. Lai and Robbins [8] studied this problem in a statistical setting and developed asymptotically efficient adaptive allocation rules.
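As a concrete illustration of the policy-selection framework described in the abstract, the sketch below plays the well-known UCB1 index policy against simulated Bernoulli slot machines. This is only an illustrative sketch: the function name, the Bernoulli reward model, and the specific arm means are assumptions for demonstration, not details taken from the paper, and UCB1 is one standard example of an asymptotically efficient allocation rule rather than the specific rule of Lai and Robbins.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Play the UCB1 policy against Bernoulli arms with the given means.

    Returns (total reward collected, number of pulls per arm).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # times each arm has been pulled
    sums = [0.0] * k      # cumulative reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1   # pull each arm once to initialize
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            # that shrinks as an arm is sampled more often.
            arm = max(range(k),
                      key=lambda i: sums[i] / counts[i]
                      + math.sqrt(2.0 * math.log(t) / counts[i]))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts
```

Over a long horizon the exploration bonus decays, so the policy concentrates its pulls on the arm with the highest empirical mean while still sampling the others occasionally, which is what keeps the regret growing only logarithmically.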
Similar resources
The Irrevocable Multiarmed Bandit Problem
This paper considers the multi-armed bandit problem with multiple simultaneous arm pulls and the additional restriction that we do not allow recourse to arms that were pulled at some point in the past but then discarded. This additional restriction is highly desirable from an operational perspective and we refer to this problem as the ‘Irrevocable Multi-Armed Bandit’ problem. We observe that na...
The Blinded Bandit: Learning with Adaptive Feedback
We study an online learning setting where the player is temporarily deprived of feedback each time it switches to a different action. Such a model of adaptive feedback naturally occurs in scenarios where the environment reacts to the player's actions and requires some time to recover and stabilize after the algorithm switches actions. This motivates a variant of the multi-armed bandit problem, wh...
Volatile Multi-Armed Bandits for Guaranteed Targeted Social Crawling
We introduce a new variant of the multi-armed bandit problem, called Volatile Multi-Arm Bandit (VMAB). A general policy for VMAB is given with proven regret bounds. The problem of collecting intelligence on profiles in social networks is then modeled as a VMAB and experimental results show the superiority of our proposed policy.
Online Multi-Armed Bandit
We introduce a novel variant of the multi-armed bandit problem, in which bandits are streamed one at a time to the player, and at each point, the player can either choose to pull the current bandit or move on to the next bandit. Once a player has moved on from a bandit, they may never visit it again, which is a crucial difference between our problem and classic multi-armed bandit problems. In t...
Cognitive Capacity and Choice under Uncertainty: Human Experiments of Two-armed Bandit Problems
The two-armed bandit problem, or more generally the multi-armed bandit problem, has been identified as the underlying problem in many practical circumstances that involve making a series of choices among uncertain alternatives. Problems like job searching, customer switching, and even the adoption of fundamental or technical trading strategies by traders in financial markets can be formulate...
Journal title: (not recorded)
Volume / Issue: (not recorded)
Pages: -
Publication date: 2011